Improving Efficiency of Incremental Mining by Trie Structure and Pre-Large Itemsets
نویسندگان
چکیده
Incremental data mining has been discussed widely in recent years, as it has many practical applications, and various incremental mining algorithms have been proposed. Hong et al. proposed an efficient incremental mining algorithm for handling newly inserted transactions by using the concept of pre-large itemsets. The algorithm aimed to reduce the need to rescan the original database and also cut maintenance costs. Recently, Lin et al. proposed the Pre-FUFP algorithm to handle new transactions more efficiently, and make it easier to update the FP-tree. However, frequent itemsets must be mined from the FP-growth algorithm. In this paper, we propose a Pre-FUT algorithm (Fast-Update algorithm using the Trie data structure and the concept of pre-large itemsets), which not only builds and updates the trie structure when new transactions are inserted, but also mines all the frequent itemsets easily from the tree. Experimental results show the good performance of the proposed algorithm.
منابع مشابه
روشی کارا برای کاوش مجموعه اقلام پرتکرار در تحلیل دادههای سبد خرید
Discovery of hidden and valuable knowledge from large data warehouses is an important research area and has attracted the attention of many researchers in recent years. Most of Association Rule Mining (ARM) algorithms start by searching for frequent itemsets by scanning the whole database repeatedly and enumerating the occurrences of each candidate itemset. In data mining problems, the size of ...
متن کاملFrequent Pattern Mining using CATSIM Tree
Efficient algorithms to discover frequent patterns are essential in data mining research. Frequent pattern mining is emerging as powerful tool for many business applications such as e-commerce, recommender systems and supply chain management and group decision support systems to name a few. Several effective data structures, such as two-dimensional arrays, graphs, trees and tries have been prop...
متن کاملSmart frequent itemsets mining algorithm based on FP-tree and DIFFset data structures
Association rule data mining is an important technique for finding important relationships in large datasets. Several frequent itemsets mining techniques have been proposed using a prefix-tree structure, FP-tree, a compressed data structure for database representation. The DIFFset data structure has also been shown to significantly reduce the run time and memory utilization of some data mining ...
متن کاملA new incremental data mining algorithm using pre-large itemsets
Due to the increasing use of very large databases and data warehouses, mining useful information and helpful knowledge from transactions is evolving into an important research area. In the past, researchers usually assumed databases were static to simplify data-mining problems. Thus, most of the classic algorithms proposed focused on batch mining, and did not utilize previously mined informatio...
متن کاملBasic Framework of CATSIM Tree for Efficient Frequent Pattern Mining
Finding frequent patterns from databases have been the most time consuming process in association rule mining. Several effective data structures, such as two-dimensional arrays, graphs, trees and tries have been proposed to collect candidate itemsets and frequent itemsets. It seems that the tree structure is most extractive to storing itemsets. The outstanding tree has been proposed so far is c...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Computing and Informatics
دوره 33 شماره
صفحات -
تاریخ انتشار 2014